Virtual Traffic Lights for Multi-Robot Navigation: Decentralized Planning with Centralized Conflict Resolution

Gupta, Sagar, Nguyen, Thanh Vinh, Phan, Thieu Long, Attri, Vidul, Gupta, Archit, Fernando, Niroshinie, Lee, Kevin, Loke, Seng W., Kutadinata, Ronny, Champion, Benjamin, Cosgun, Akansel

arXiv.org Artificial Intelligence

We present a hybrid multi-robot coordination framework that combines decentralized path planning with centralized conflict resolution. In our approach, each robot autonomously plans its path and shares this information with a centralized node. The centralized system detects potential conflicts and allows only one of the conflicting robots to proceed at a time, instructing others to stop outside the conflicting area to avoid deadlocks. Unlike traditional centralized planning methods, our system does not dictate robot paths but instead provides stop commands, functioning as a virtual traffic light. In simulation experiments with multiple robots, our approach increased the success rate of robots reaching their goals while reducing deadlocks. Furthermore, we successfully validated the system in real-world experiments with two quadruped robots and separately with wheeled Duckiebots.
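The virtual-traffic-light idea above can be sketched in a few lines: each robot submits its planned path, and a central node detects pairwise path overlaps and stops the lower-priority robot in each conflict. This is a toy illustration under assumed data structures (grid-cell paths, a numeric priority map), not the authors' implementation.

```python
from itertools import combinations

def find_conflicts(paths):
    """Return pairs of robot ids whose planned paths share any cell."""
    conflicts = []
    for (a, pa), (b, pb) in combinations(paths.items(), 2):
        if set(pa) & set(pb):
            conflicts.append((a, b))
    return conflicts

def traffic_light(paths, priority):
    """Grant GO to the higher-priority robot in each pairwise conflict;
    STOP the other. Robots with no conflicts always get GO."""
    signals = {r: "GO" for r in paths}
    for a, b in find_conflicts(paths):
        loser = a if priority[a] > priority[b] else b
        signals[loser] = "STOP"
    return signals

paths = {
    "r1": [(0, 0), (1, 0), (2, 0)],
    "r2": [(2, 2), (2, 1), (2, 0)],   # shares cell (2, 0) with r1
    "r3": [(5, 5), (5, 6)],           # no conflict
}
priority = {"r1": 0, "r2": 1, "r3": 2}  # lower value = higher priority
print(traffic_light(paths, priority))   # r1 proceeds, r2 stops, r3 proceeds
```

Note that, as in the paper, the central node never replans paths; it only issues stop signals, leaving path planning fully decentralized.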


GPT-5 Model Corrected GPT-4V's Chart Reading Errors, Not Prompting

Yang, Kaichun, Chen, Jian

arXiv.org Artificial Intelligence

We present a quantitative evaluation to understand the effects of zero-shot large language models (LLMs) and prompting on chart reading tasks. We asked LLMs to answer 107 visualization questions to compare inference accuracy between the agentic GPT-5 and the multimodal GPT-4V on difficult image instances where GPT-4V failed to produce correct answers. Our results show that model architecture dominates inference accuracy: GPT-5 largely improved accuracy, while prompt variants yielded only small effects. Pre-registration of this work is available here; the Google Drive materials are here. Benchmarking visual literacy, i.e., "the ability and skill to read and interpret visually represented data and to extract information from data visualizations" [1], shapes progress in measuring AI's ability to handle visualization images. Often, the same tasks designed to assess visual literacy, traditionally performed by human observers, are now being assigned to algorithms. Following this trend, our goal in this paper is to quantify the new GPT-5's ability to read charts. Specifically, we used questions where GPT-4V failed and other LLMs achieved only low accuracy, as reported in Verma et al.'s CHART-6 benchmark [2].


Hypershell Pro X Series Review: An Exoskeleton You Can Actually Buy

WIRED

This wearable power-up gives your legs a boost up hills, and unlike the competition, you can actually buy it, but we're not totally sure you should. Good assistance, makes hills easier on the legs. Seems to reduce muscle strain, not physical effort.


Adaptive Evolutionary Framework for Safe, Efficient, and Cooperative Autonomous Vehicle Interactions

Tian, Zhen, Lin, Zhihao

arXiv.org Artificial Intelligence

Modern transportation systems face significant challenges in ensuring road safety, given the serious injuries caused by road accidents. The rapid growth of autonomous vehicles (AVs) has prompted new traffic designs that aim to optimize interactions among AVs. However, effective interaction between AVs remains challenging in the absence of centralized control, and multiple factors must be balanced, including passenger demands and overall traffic efficiency. Traditional rule-based, optimization-based, and game-theoretic approaches each have limitations in addressing these challenges. Rule-based methods struggle with adaptability and generalization in complex scenarios, while optimization-based methods often require high computational resources. Game-theoretic approaches, such as Stackelberg and Nash games, suffer from limited adaptability and potential inefficiencies in cooperative settings. This paper proposes an Evolutionary Game Theory (EGT)-based framework for AV interactions that overcomes these limitations through a decentralized and adaptive strategy evolution mechanism. A causal evaluation module is introduced to optimize the evolutionary rate (CEGT), balancing mutation and evolution by learning from historical interactions. Simulation results demonstrate that the proposed CEGT outperforms plain EGT as well as the Nash and Stackelberg benchmark games, achieving lower collision rates, larger safety distances, and higher speeds across diverse scenarios and parameter settings.
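The strategy-evolution mechanism at the core of EGT is replicator dynamics: strategies whose fitness exceeds the population average grow in share. A minimal discrete-time sketch, with an entirely made-up two-strategy AV payoff matrix (yield vs. proceed) standing in for the paper's interaction model:

```python
def replicator_step(x, payoff, rate=0.1):
    """One discrete replicator-dynamics update.
    x: population share of each strategy; payoff: square payoff matrix.
    Strategies whose fitness beats the population average grow."""
    n = len(x)
    fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
    avg = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + rate * x[i] * (fitness[i] - avg) for i in range(n)]

# Toy AV payoffs against (yield, proceed) opponents; mutual "proceed"
# is penalized heavily to model collision risk. Values are illustrative.
payoff = [[2.0, 1.0],   # yield
          [3.0, -5.0]]  # proceed
x = [0.5, 0.5]
for _ in range(200):
    x = replicator_step(x, payoff)
print(x)  # the yield share approaches its equilibrium value of 6/7
```

The `rate` parameter here is the evolutionary rate the paper's causal evaluation module adapts; in this fixed-rate sketch it is simply a constant step size.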


Learning a Continue-Thinking Token for Enhanced Test-Time Scaling

Ringel, Liran, Tolochinsky, Elad, Romano, Yaniv

arXiv.org Artificial Intelligence

Test-time scaling has emerged as an effective approach for improving language model performance by utilizing additional compute at inference time. Recent studies have shown that overriding the end-of-thinking token (e.g., replacing it with "Wait") can extend reasoning steps and improve accuracy. In this work, we explore whether a dedicated continue-thinking token can be learned to trigger extended reasoning. We augment a distilled version of DeepSeek-R1 with a single learned "<|continue-thinking|>" token, training only its embedding via reinforcement learning while keeping the model weights frozen. Our experiments show that this learned token achieves improved accuracy on standard math benchmarks compared to both the baseline model and a test-time scaling approach that uses a fixed token (e.g., "Wait") for budget forcing. In particular, in cases where the fixed-token approach enhances the base model's accuracy, our method achieves a markedly greater improvement. For example, on the GSM8K benchmark, the fixed-token approach yields a 1.3% absolute improvement in accuracy, whereas our learned-token method achieves a 4.2% improvement over the base model that does not use budget forcing.
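The budget-forcing mechanism itself is a simple decoding-time intervention: whenever the model tries to emit its end-of-thinking token before the budget is spent, the token is swapped for a continue token. A toy sketch with a mock model in place of an LLM (all names hypothetical; the real method additionally learns the continue token's embedding):

```python
def generate_with_budget_forcing(step_fn, end_token, continue_token,
                                 min_thinking_steps, max_steps=50):
    """Decode token-by-token; if the model tries to end its reasoning
    before the thinking budget is spent, substitute a continue token.
    step_fn(tokens) -> next token (stands in for one LLM sampling step)."""
    tokens = []
    while len(tokens) < max_steps:
        nxt = step_fn(tokens)
        if nxt == end_token and len(tokens) < min_thinking_steps:
            nxt = continue_token  # force more reasoning
        tokens.append(nxt)
        if nxt == end_token:
            break
    return tokens

# Toy model: emits "step" three times, then always wants to stop.
def toy_model(tokens):
    return "step" if sum(t == "step" for t in tokens) < 3 else "</think>"

out = generate_with_budget_forcing(
    toy_model, "</think>", "<|continue-thinking|>", min_thinking_steps=6)
print(out)  # three "step" tokens, three forced continue tokens, then the end token
```

The paper's contribution is that the substituted token is learned rather than a fixed word like "Wait"; the substitution logic stays this simple either way.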


Reinforce LLM Reasoning through Multi-Agent Reflection

Yuan, Yurun, Xie, Tengyang

arXiv.org Artificial Intelligence

Leveraging more test-time computation has proven to be an effective way to boost the reasoning capabilities of large language models (LLMs). Among various methods, the verify-and-improve paradigm stands out for enabling dynamic solution exploration and feedback incorporation. However, existing approaches often suffer from restricted feedback spaces and a lack of coordinated training of the different parties, leading to suboptimal performance. To address this, we model this multi-turn refinement process as a Markov Decision Process and introduce DPSDP (Direct Policy Search by Dynamic Programming), a reinforcement learning algorithm that trains an actor-critic LLM system to iteratively refine answers via direct preference learning on self-generated data. Theoretically, DPSDP can match the performance of any policy within the training distribution. Empirically, we instantiate DPSDP with various base models and show improvements on both in- and out-of-distribution benchmarks. For example, on the MATH 500 benchmark, majority voting over five refinement steps increases first-turn accuracy from 58.2% to 63.2% with Ministral-based models. An ablation study further confirms the benefits of multi-agent collaboration and out-of-distribution generalization.
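The inference-time loop described above (actor proposes, critic criticizes, actor refines, majority vote over turns) can be sketched with mock actor and critic functions. This is a toy illustration of the multi-turn refinement protocol only; DPSDP's actual contribution is how the two LLMs are trained, which is not shown here.

```python
from collections import Counter

def refine_loop(actor, critic, question, steps=5):
    """Actor proposes an answer; critic returns feedback; actor refines.
    The final answer is a majority vote over the answers from all turns."""
    answers = [actor(question, feedback=None)]
    for _ in range(steps - 1):
        feedback = critic(question, answers[-1])
        answers.append(actor(question, feedback=feedback))
    return Counter(answers).most_common(1)[0][0]

# Toy stand-ins (hypothetical): the actor's first attempt is off by one,
# and it corrects itself once it receives any critic feedback.
def toy_actor(q, feedback):
    return 41 if feedback is None else 42

def toy_critic(q, answer):
    return "off by one" if answer != 42 else "looks correct"

print(refine_loop(toy_actor, toy_critic, "What is 6*7?"))  # -> 42
```

Majority voting over turns, rather than trusting only the last refinement, is what makes the loop robust to a late turn that regresses.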


DriveMind: A Dual-VLM based Reinforcement Learning Framework for Autonomous Driving

Wasif, Dawood, Moore, Terrence J, Reddy, Chandan K, Cho, Jin-Hee

arXiv.org Artificial Intelligence

Recent advances in autonomous vehicles have shifted development from rigid pipelines to end-to-end neural policies mapping raw sensor streams directly to control commands [1-3]. While these models offer streamlined architectures and strong benchmark performance, they raise critical deployment concerns. Their internal logic is opaque, complicating validation in safety-critical settings. They struggle to generalize to rare events like severe weather or infrastructure damage and lack formal guarantees on kinematic properties such as speed limits and lane-keeping. Further, they provide no natural interface for human oversight or explanation. These challenges motivate frameworks that combine deep network expressiveness with transparency, robustness, and provable safety. Meanwhile, Large Language Models (LLMs) and Vision Language Models (VLMs) have demonstrated human-level reasoning and visual grounding [4-6]. Recent works like VLM-SR (Shaped Rewards) [7], VLM-RM (Reward Models) [8], and RoboCLIP (Language-Conditioned Robot Learning via Contrastive Language-Image Pretraining) [9] inject semantic feedback into Reinforcement Learning (RL), but rely on static prompts unsuited to evolving road conditions and overlook vehicle dynamics.


Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning

Yang, Zhaohui, He, Chenghua, Shi, Xiaowen, Li, Linjing, Yin, Qiyue, Deng, Shihong, Jiang, Daxin

arXiv.org Artificial Intelligence

Many studies focus on data annotation techniques for training effective process reward models (PRMs). However, current methods encounter a significant issue when applied to long chain-of-thought (CoT) reasoning processes: they tend to focus solely on the first incorrect step and all preceding steps, assuming that all subsequent steps are incorrect. These methods overlook the unique self-correction and reflection mechanisms inherent in long CoT, where correct reasoning steps may still occur after initial reasoning mistakes. To address this issue, we propose a novel data annotation method for PRMs specifically designed to score the long CoT reasoning process. Given that under the reflection pattern, correct and incorrect steps often alternate, we introduce the concepts of Error Propagation and Error Cessation, enhancing PRMs' ability to identify both effective self-correction behaviors and reasoning based on erroneous steps. Leveraging an LLM-based judger for annotation, we collect 1.7 million data samples to train a 7B PRM and evaluate it at both solution and step levels. Experimental results demonstrate that, compared to existing open-source PRMs and PRMs trained on open-source datasets, our PRM achieves superior performance across various metrics, including search guidance, Best-of-N (BoN), and F1 scores. Compared to widely used Monte Carlo (MC)-based annotation methods, our annotation approach not only achieves higher data efficiency but also delivers superior performance. A detailed analysis is also conducted to demonstrate the stability and generalizability of our method.
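The contrast between first-error annotation and reflection-aware annotation can be made concrete on a toy trace of per-step correctness verdicts (as an LLM judger might produce). This is a simplified illustration of the idea, not the paper's exact labeling rules:

```python
def naive_labels(step_correct):
    """First-error annotation: every step after the first mistake is
    labeled wrong, even if the chain later self-corrects."""
    labels, seen_error = [], False
    for ok in step_correct:
        seen_error = seen_error or not ok
        labels.append(0 if seen_error else 1)
    return labels

def reflection_labels(step_correct):
    """Reflection-aware annotation: a correct step after a mistake marks
    an Error Cessation and is labeled positive again; an incorrect step
    that builds on an earlier mistake is Error Propagation and stays
    negative."""
    return [1 if ok else 0 for ok in step_correct]

# A long-CoT trace: mistake at step 2, self-correction from step 3 onward.
trace = [True, False, True, True]
print(naive_labels(trace))       # [1, 0, 0, 0]
print(reflection_labels(trace))  # [1, 0, 1, 1]
```

The training signal differs on exactly the recovered steps (3 and 4 here), which is where a reflection-aware PRM learns to reward self-correction.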


CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving

Uppuluri, Bhargava, Patel, Anjel, Mehta, Neil, Kamath, Sridhar, Chakraborty, Pratyush

arXiv.org Artificial Intelligence

Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards, which helps them adapt to dynamic environments. However, ensuring their generalization remains challenging, especially with static training environments. Additionally, DRL models lack transparency, making it difficult to guarantee safety in all scenarios, particularly those not seen during training. To tackle these issues, we propose a method that combines DRL with Curriculum Learning for autonomous driving. Our approach uses a Proximal Policy Optimization (PPO) agent and a Variational Autoencoder (VAE) to learn safe driving in the CARLA simulator. The agent is trained with two-fold curriculum learning, progressively increasing environment difficulty and incorporating a collision penalty in the reward function to promote safety. This method improves the agent's adaptability and reliability in complex environments and helps it balance multiple reward components from different feedback signals within a single scalar reward function.
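The two ingredients named above (a scalar reward mixing several feedback signals with a collision penalty, and a curriculum that steps up difficulty over training) can be sketched as follows. The weights, thresholds, and signal names are illustrative assumptions, not the paper's actual values:

```python
def driving_reward(speed, lane_offset, collided,
                   w_speed=0.1, w_lane=0.5, collision_penalty=10.0):
    """Toy scalar reward: encourage forward speed, penalize lane
    deviation, and apply a large penalty on collision. Weights are
    illustrative and show the balancing problem the paper discusses."""
    r = w_speed * speed - w_lane * abs(lane_offset)
    if collided:
        r -= collision_penalty
    return r

def curriculum_stage(episode, thresholds=(100, 300)):
    """Curriculum schedule: environment difficulty steps up as training
    progresses (e.g., more traffic or harder conditions in CARLA)."""
    stage = 0
    for t in thresholds:
        if episode >= t:
            stage += 1
    return stage

print(driving_reward(speed=8.0, lane_offset=0.2, collided=False))  # about 0.7
print(curriculum_stage(episode=150))  # stage 1 of the toy curriculum
```

Because all signals collapse into one scalar, small changes to the weights can flip which behavior the agent prefers; that sensitivity is the "nuance" the abstract refers to.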


ProgCo: Program Helps Self-Correction of Large Language Models

Song, Xiaoshuai, Wu, Yanan, Wang, Weixun, Liu, Jiaheng, Su, Wenbo, Zheng, Bo

arXiv.org Artificial Intelligence

Self-Correction aims to enable large language models (LLMs) to self-verify and self-refine their initial responses without external feedback. However, LLMs often fail to effectively self-verify and generate correct feedback, which misleads refinement and leads to the failure of self-correction, especially in complex reasoning tasks. In this paper, we propose Program-driven Self-Correction (ProgCo). First, program-driven verification (ProgVe) achieves complex verification logic and extensive validation through self-generated, self-executing verification pseudo-programs. Then, program-driven refinement (ProgRe) receives feedback from ProgVe and conducts dual reflection and refinement on both the responses and the verification programs, mitigating the misleading effects of incorrect feedback in complex reasoning tasks. Experiments on three instruction-following and mathematical benchmarks indicate that ProgCo achieves effective self-correction and can further enhance performance when combined with real program tools.
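The verify-then-refine control flow behind this family of methods can be sketched with a real executable check standing in for the self-generated verification program. This is a toy illustration of the loop structure (all names hypothetical), not ProgCo itself, which generates and reflects on the verification program too:

```python
def self_correct(generate, verify, refine, max_rounds=3):
    """Program-driven self-correction sketch: produce an answer, run a
    verification routine on it, and refine while verification fails."""
    answer = generate()
    for _ in range(max_rounds):
        ok, feedback = verify(answer)
        if ok:
            return answer
        answer = refine(answer, feedback)
    return answer

# Toy instance: the "response" should be a list of numbers summing to 10.
generate = lambda: [3, 3, 3]

def verify(ans):
    total = sum(ans)
    return total == 10, f"sum is {total}, expected 10"

def refine(ans, feedback):
    return ans + [10 - sum(ans)]  # patch the response to satisfy the check

print(self_correct(generate, verify, refine))  # -> [3, 3, 3, 1]
```

The key design point the paper argues for is making `verify` an executable program rather than free-form LLM self-critique, so its feedback is grounded in actual execution rather than the model's possibly wrong judgment.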